NSF PAR Search | NSF Public Access Repository

Data-Efficient Policy Evaluation Through Behavior Policy Search

Hanna, Josiah P; Chandak, Yash; Thomas, Philip S; White, Martha; Stone, Peter; Niekum, Scott (October 2024, Journal of Machine Learning Research)
Ravikumar, Pradeep (Ed.)
We consider the task of evaluating a policy for a Markov decision process (MDP). The standard unbiased technique for evaluating a policy is to deploy the policy and observe its performance. We show that the data collected from deploying a different policy, commonly called the behavior policy, can be used to produce unbiased estimates with lower mean squared error than this standard technique. We derive an analytic expression for a minimal variance behavior policy -- a behavior policy that minimizes the mean squared error of the resulting estimates. Because this expression depends on terms that are unknown in practice, we propose a novel policy evaluation sub-problem, behavior policy search: searching for a behavior policy that reduces mean squared error. We present two behavior policy search algorithms and empirically demonstrate their effectiveness in lowering the mean squared error of policy performance estimates.
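To make the abstract's central claim concrete, here is a minimal sketch of off-policy evaluation with ordinary importance sampling on a one-step problem (a two-armed bandit). It is not the paper's algorithm or its analytic expression. The reward values, the policies, and all function names are illustrative assumptions, and the behavior policy used is the classical optimal importance-sampling proposal for deterministic rewards, with probabilities proportional to pi_e(a) * |r(a)|.

```python
import random

def on_policy_estimate(pi_e, rewards, n, rng):
    """Monte Carlo estimate: sample actions from the evaluation policy itself."""
    actions = rng.choices(range(len(pi_e)), weights=pi_e, k=n)
    return sum(rewards[a] for a in actions) / n

def importance_sampling_estimate(pi_e, pi_b, rewards, n, rng):
    """Sample actions from the behavior policy and reweight by pi_e / pi_b."""
    actions = rng.choices(range(len(pi_b)), weights=pi_b, k=n)
    return sum(pi_e[a] / pi_b[a] * rewards[a] for a in actions) / n

def mse(estimator, true_value, trials):
    """Empirical mean squared error of an estimator over repeated trials."""
    return sum((estimator() - true_value) ** 2 for _ in range(trials)) / trials

rng = random.Random(0)

# Hypothetical problem: two actions with deterministic rewards.
pi_e = [0.9, 0.1]        # evaluation policy (assumed)
rewards = [1.0, 10.0]    # reward per action (assumed)
true_value = sum(p * r for p, r in zip(pi_e, rewards))

# Behavior policy proportional to pi_e(a) * |r(a)|: every reweighted
# sample then equals the true value, so the estimator has zero variance.
unnormalized = [p * abs(r) for p, r in zip(pi_e, rewards)]
pi_b = [w / sum(unnormalized) for w in unnormalized]

mse_on = mse(lambda: on_policy_estimate(pi_e, rewards, 20, rng), true_value, 2000)
mse_is = mse(
    lambda: importance_sampling_estimate(pi_e, pi_b, rewards, 20, rng),
    true_value,
    2000,
)

print("on-policy MSE:", mse_on)
print("importance-sampling MSE:", mse_is)
```

The behavior policy here is built from the true rewards, which are unknown in practice. That dependence is the reason the abstract frames the search for a good behavior policy as its own sub-problem.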
Full Text Available
Is Ockham’s razor losing its edge? New perspectives on the principle of model parsimony

https://doi.org/10.1073/pnas.2401230121

Dubova, Marina; Chandramouli, Suyog; Gigerenzer, Gerd; Grünwald, Peter; Holmes, William; Lombrozo, Tania; Marelli, Marco; Musslick, Sebastian; Nicenboim, Bruno; Ross, Lauren N; et al. (February 2025, Proceedings of the National Academy of Sciences)

The preference for simple explanations, known as the parsimony principle, has long guided the development of scientific theories, hypotheses, and models. Yet recent years have seen a number of successes in employing highly complex models for scientific inquiry (e.g., for 3D protein folding or climate forecasting). In this paper, we reexamine the parsimony principle in light of these scientific and technological advancements. We review recent developments, including the surprising benefits of modeling with more parameters than data, the increasing appreciation of the context-sensitivity of data and misspecification of scientific models, and the development of new modeling tools. By integrating these insights, we reassess the utility of parsimony as a proxy for desirable model traits, such as predictive accuracy, interpretability, effectiveness in guiding new research, and resource efficiency. We conclude that more complex models are sometimes essential for scientific progress, and discuss the ways in which parsimony and complexity can play complementary roles in scientific modeling practice.
Full Text Available
Towards Safe Policy Improvement for Non-Stationary MDPs

Chandak, Yash; Jordan, Scott; Theocharous, Georgios; White, Martha; Thomas, Philip (January 2020, Advances in Neural Information Processing Systems)
Full Text Available
The Utility of Sparse Representations for Control in Reinforcement Learning

Liu, Vincent; Kumaraswamy, Raksha; Le, Lei; White, Martha (January 2019, AAAI Conference on Artificial Intelligence)

Full Text Available
